Optimizing Checkpointing Performance in Spark
Authors
Abstract
Similar Resources
Optimizing Shuffle Performance in Spark
Spark [6] is a cluster framework that performs in-memory computing, with the goal of outperforming disk-based engines like Hadoop [2]. As with other distributed data processing platforms, it is common to collect data in a many-to-many fashion, a stage traditionally known as the shuffle phase. In Spark, many sources of inefficiency exist in the shuffle phase that, once addressed, potentially prom...
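To make the shuffle phase concrete, here is a minimal Spark (Scala) sketch, not code from the paper above: the reduceByKey step forces a many-to-many shuffle, and the checkpoint() call afterwards shows where lineage would be truncated, matching the checkpointing theme of this article. The application name, input path, and checkpoint directory are placeholder assumptions.

    import org.apache.spark.sql.SparkSession

    object ShuffleSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("shuffle-sketch").getOrCreate()
        val sc = spark.sparkContext
        sc.setCheckpointDir("/tmp/spark-checkpoints") // placeholder directory

        val counts = sc.textFile("hdfs:///data/input.txt") // placeholder input
          .flatMap(_.split("\\s+"))                        // split lines into words
          .map(word => (word, 1))
          .reduceByKey(_ + _) // shuffle: records with equal keys meet on one partition

        counts.checkpoint()   // mark the shuffled RDD for checkpointing
        counts.count()        // the action materializes the RDD and writes the checkpoint
        spark.stop()
      }
    }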
Optimizing VM Checkpointing for Restore Performance in VMware ESXi
Irene Zhang started her presentation by explaining that checkpointing is similar to “suspend,” but while taking a checkpoint, the virtual machine can continue its execution. Because checkpointing is used for fault tolerance, taking a checkpoint can be done quickly, in less than a few seconds; restoring from the checkpoint, although slow, hasn’t been a problem; however, recent applications, such...
Memory Exclusion: Optimizing the Performance of Checkpointing Systems
Checkpointing systems are a convenient way for users to make their programs fault-tolerant by intermittently saving program state to disk, and restoring that state following a failure. The main concern with checkpointing is the overhead that it adds to the running time of the program. This paper describes memory exclusion, an important class of optimizations that reduce the overhead of checkpointin...
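The memory-exclusion idea described above can be illustrated with a small, hypothetical sketch (not the paper's implementation): at checkpoint time, regions that are dead (never read again) or clean (unchanged since the last checkpoint) are skipped, and only the remaining regions are written to stable storage. The Region type and its live/dirty flags are assumptions made purely for illustration.

    import java.io.{DataOutputStream, OutputStream}

    // Hypothetical memory region with liveness and dirtiness flags.
    case class Region(id: Int, data: Array[Byte], live: Boolean, dirty: Boolean)

    object MemoryExclusionSketch {
      // Write only live, dirty regions; excluded regions add no checkpoint overhead.
      def writeCheckpoint(regions: Seq[Region], out: OutputStream): Unit = {
        val dos = new DataOutputStream(out)
        for (r <- regions if r.live && r.dirty) {
          dos.writeInt(r.id)          // which region is being saved
          dos.writeInt(r.data.length) // its length, followed by its bytes
          dos.write(r.data)
        }
        dos.flush()
      }
    }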
The Performance of Consistent Checkpointing
Consistent checkpointing provides transparent fault tolerance for long-running distributed applications. In this paper we describe performance measurements of an implementation of consistent checkpointing. Our measurements show that consistent checkpointing performs remarkably well. We executed eight compute-intensive distributed applications on a network of diskless Sun workstations, comparin...
Checkpointing Orchestration for Performance Improvement
Checkpointing is a widely used mechanism for supporting fault tolerance in high-performance computing (HPC), but it is notorious for its expensive disk access. Parallel file systems such as Lustre, GPFS, and PVFS are widely deployed on supercomputers to provide fast I/O bandwidth for general data-intensive applications. However, the unique feature of checkpointing makes it impossible to benefit from the ...
Journal
Journal title: DEStech Transactions on Computer Science and Engineering
Year: 2017
ISSN: 2475-8841
DOI: 10.12783/dtcse/csma2017/17315